Convolutional Neural Networks

Standard Imports

Data

What the Data is to our computer

"Convolutional" Kernels

What are "Kernels"?

  1. It's just a set of linear transformations; so
  2. It can be represented as a matrix multiplication; this also means
  3. The partial derivatives (and hence the gradient) with respect to the weights are easy to calculate
  4. And can again be represented as a matrix multiplication
$$\huge \begin{align*}
Y'_{ijk} = {}& g_{ijk}(X)\\
= {}& \sum_{c}\sum_{i_0 = 0:kw}\sum_{j_0 = 0:kw} X_{(i+i_0)(j+j_0)c} K_{i_0j_0k}\\
Y_{ijk} = {}& f(g_{ijk}(X)) \\
\frac{\partial}{\partial K_{abk}} Y_{ijk} = {} & \frac{\partial Y_{ijk}}{\partial Y'_{ijk}} \frac{\partial Y'_{ijk}}{\partial K_{abk}}\\
= {}& \frac{\partial Y_{ijk}}{\partial Y'_{ijk}} \sum_{c} X_{(i+a)(j+b)c} \\
\frac{\partial}{\partial K_{abk}} \sum_{i,j} Y_{ijk} = {} & \sum_{i,j} \frac{\partial Y_{ijk}}{\partial Y'_{ijk}} \frac{\partial Y'_{ijk}}{\partial K_{abk}}\\
= {}& \sum_{i,j} \frac{\partial Y_{ijk}}{\partial Y'_{ijk}} \sum_{c} X_{(i+a)(j+b)c} \\
\end{align*}$$
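
The four points above can be checked numerically. The following is a minimal NumPy sketch, not a reference implementation. It assumes stride 1, no padding, a kernel shared across input channels (as in the formula, where the kernel carries no channel index), and `tanh` as a stand-in for the activation $f$. The names `conv_loops` and `patches_matrix` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def conv_loops(X, K):
    """Direct evaluation of Y'[i,j,k] = sum_c sum_{i0,j0} X[i+i0, j+j0, c] * K[i0, j0, k]."""
    H, W, _ = X.shape
    kw, _, n_k = K.shape
    out = np.zeros((H - kw + 1, W - kw + 1, n_k))
    for i in range(H - kw + 1):
        for j in range(W - kw + 1):
            patch = X[i:i + kw, j:j + kw, :].sum(axis=2)  # sum over channels c
            for k in range(n_k):
                out[i, j, k] = np.sum(patch * K[:, :, k])
    return out

def patches_matrix(X, kw):
    """Stack every channel-summed kw x kw patch as one row (an im2col-style layout)."""
    H, W, _ = X.shape
    Xs = X.sum(axis=2)
    rows = [Xs[i:i + kw, j:j + kw].ravel()
            for i in range(H - kw + 1) for j in range(W - kw + 1)]
    return np.array(rows)  # shape: (n_positions, kw * kw)

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 6, 3))   # height x width x channels (assumed sizes)
K = rng.normal(size=(3, 3, 2))   # kw x kw x number of filters (assumed sizes)
kw, n_k = K.shape[0], K.shape[2]

# Points 1 and 2: the convolution is a single matrix multiplication.
P = patches_matrix(X, kw)                  # (16, 9)
Y_pre_mat = P @ K.reshape(kw * kw, n_k)    # (16, 2), one row per position (i, j)
matches_loops = np.allclose(Y_pre_mat, conv_loops(X, K).reshape(-1, n_k))

# Points 3 and 4: the gradient of sum_{i,j} Y_{ijk} w.r.t. K_{abk} is again a matrix product.
dY = 1.0 - np.tanh(Y_pre_mat) ** 2         # f'(Y') for f = tanh
grad = (P.T @ dY).reshape(K.shape)         # (3, 3, 2)

# Central finite differences as an independent check of the analytic gradient.
def total(K_try):
    return np.tanh(conv_loops(X, K_try)).sum()

eps = 1e-6
numeric = np.zeros_like(K)
for idx in np.ndindex(*K.shape):
    K_plus, K_minus = K.copy(), K.copy()
    K_plus[idx] += eps
    K_minus[idx] -= eps
    numeric[idx] = (total(K_plus) - total(K_minus)) / (2 * eps)
gradient_ok = np.allclose(grad, numeric, atol=1e-5)

print(matches_loops, gradient_ok)
```

The forward pass and the gradient both reduce to products with the same patch matrix `P`, which is why both are cheap to express. Deep learning libraries use more elaborate layouts, but the underlying idea is the same.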

(Max) Pooling

Coding Lecture

After reading Chapter 9, you should be aware of:

the key details and specifics of CNNs

what the benefits of CNNs are

the interpretation of CNNs as a Bayesian prior

what convolutions are

some further considerations regarding convolutions

how computationally demanding CNNs are